Abstract. A sequence is nonrepetitive if it does not contain two adjacent 
identical blocks. The remarkable construction of Thue asserts that 3 symbols 
are enough to build an arbitrarily long nonrepetitive sequence. It is still not 
settled whether the following extension holds: for every sequence of 3-element 
sets $L_1, \ldots, L_n$ there exists a nonrepetitive sequence $s_1, \ldots, s_n$ with $s_i \in L_i$.
We propose a new non-constructive way to build long nonrepetitive sequences 
and provide an elementary proof that sets of size 4 suffice, confirming the best 
known bound. The simple double counting at the heart of the argument is 
inspired by the recent algorithmic proof of the Lovász local lemma due to Moser 
and Tardos. Furthermore, we apply this approach and present game-theoretic 
type results on nonrepetitive sequences. The nonrepetitive game is played by two 
players who pick, one by one, consecutive terms of a sequence over a given set 
of symbols. The first player tries to avoid repetitions, while the second player, 
in contrast, wants to create them. Of course, by simple imitation, the second 
player can force lots of repetitions of size 1. However, as proved by Pegden, 
there is a strategy for the first player to build an arbitrarily long sequence over 
37 symbols with no repetitions of size greater than 1. Our techniques allow us to 
reduce 37 to 6. Another game we consider is the erase-repetition game. Here, 
whenever a repetition occurs, the repeated block is immediately erased and 
the next player to move continues the play. We prove that there is a strategy 
for the first player to build an arbitrarily long nonrepetitive sequence over 8 
symbols. 



1. Introduction 

A repetition of size $h$ in a sequence $S$ is a subsequence of consecutive terms of $S$ 
consisting of two identical blocks $x_1 \ldots x_h x_1 \ldots x_h$. A sequence is nonrepetitive if it 
does not contain a repetition of any size $h \ge 1$. For instance, the sequence 1232312 
contains a repetition 2323 of size two, while 123132123 is nonrepetitive. 
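
The definition can be checked mechanically. The following is a minimal sketch (the function names find_repetition and is_nonrepetitive are ours, not part of the paper) that searches a word for two adjacent identical blocks by brute force:

```python
def find_repetition(seq):
    """Return (start, h) for a repetition of smallest size h, or None if there is none."""
    n = len(seq)
    for h in range(1, n // 2 + 1):
        for start in range(n - 2 * h + 1):
            # Compare the block of length h at `start` with the block right after it.
            if seq[start:start + h] == seq[start + h:start + 2 * h]:
                return start, h
    return None


def is_nonrepetitive(seq):
    """True if seq contains no two adjacent identical blocks."""
    return find_repetition(seq) is None


print(find_repetition("1232312"))     # position and size of the repeated block 23
print(is_nonrepetitive("123132123"))
```

The search is cubic in the length of the word, which is enough for the small examples discussed here.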

It is easy to see that each binary sequence of length at least four contains a 
repetition. In 1906 Thue [24] proved that 3 symbols are sufficient to produce 
arbitrarily long nonrepetitive sequences (see [6]). His method is constructive and 
uses substitutions over a given set of symbols. For instance, the substitution 

1 -> 12312 

2 -> 131232 

3 -> 1323132 



preserves the property of nonrepetitiveness on the set of finite sequences over 
{1,2,3}. This means that replacing all symbols in a nonrepetitive sequence by 
the assigned blocks results in a sequence that still does not contain repetitions. 
Sequences generated by substitutions have found many unexpected applications in 
such diverse areas as group theory, universal algebra, number theory, ergodic 
theory, and formal language theory. The work of Thue inspired a stream of research 
leading to the emergence of new branches of mathematics with a variety of challenging 
open problems (see [1, 5, 7, 14, 19]). 
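
As an illustration of the substitution method, the sketch below applies the substitution 1 -> 12312, 2 -> 131232, 3 -> 1323132 a few times, starting from the single symbol 1, and checks each result by brute force. The helper names are ours, and the check is only an experiment on the first few iterates, not a proof.

```python
THUE_SUBSTITUTION = {"1": "12312", "2": "131232", "3": "1323132"}


def substitute(word, rules=THUE_SUBSTITUTION):
    """Replace every symbol of `word` by its assigned block."""
    return "".join(rules[symbol] for symbol in word)


def is_nonrepetitive(word):
    """Brute-force test for the absence of two adjacent identical blocks."""
    n = len(word)
    return not any(
        word[i:i + h] == word[i + h:i + 2 * h]
        for h in range(1, n // 2 + 1)
        for i in range(n - 2 * h + 1)
    )


word = "1"
for _ in range(3):
    word = substitute(word)
    print(len(word), is_nonrepetitive(word))
```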

In this paper we present a different approach to creating long nonrepetitive 
sequences. Consider the following naive procedure: generate consecutive terms of a 
sequence by choosing symbols at random (uniformly and independently) and every 
time a repetition occurs, erase the repeated block and continue. For instance, if the 
generated sequence is 12323, we must cancel the last two symbols, which brings us 
back to 123. 
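
The erasing step of this procedure can be sketched as follows. The sketch replays a given stream of symbols instead of drawing them at random, so that the example above is reproducible; the function names are ours.

```python
def append_with_erasure(seq, symbol):
    """Append `symbol`; if the result ends with a repeated block, erase its second copy."""
    seq = seq + [symbol]
    for h in range(1, len(seq) // 2 + 1):
        if seq[-2 * h:-h] == seq[-h:]:
            return seq[:-h]
    return seq


def replay(symbols):
    """Feed the symbols one by one into the erase procedure and return the final sequence."""
    seq = []
    for symbol in symbols:
        seq = append_with_erasure(seq, symbol)
    return seq


print(replay("12323"))  # the last two symbols are cancelled, leaving 1, 2, 3
```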

We prove by a simple counting argument that with positive probability the length of a 
constructed sequence exceeds any finite bound, provided the number of symbols is 
at least 4. This is slightly weaker than Thue's result, but our argument remains 
valid in more general settings, in which the method of substitutions does not seem 
to work. 

One particular example of such a setting is the list version of nonrepetitive 
sequences, an analog of the classical graph choosability introduced by Vizing [25] 
and independently by Erdős, Rubin, and Taylor [9]. Suppose we are given a 
collection of lists (sets of symbols) $L_1, \ldots, L_n$. A sequence $s_1 \ldots s_n$ is chosen from 
lists $L_1, \ldots, L_n$ if $s_i \in L_i$ for all $i = 1, \ldots, n$. The following list version of Thue's 
theorem seems plausible. 

Conjecture 1. For every $n \ge 1$ and a sequence of sets $L_1, \ldots, L_n$, each of size 3, 
there is a nonrepetitive sequence chosen from $L_1, \ldots, L_n$. 

Notice that the statement of the conjecture is not obvious, even for lists of any 
given size. However, a rather straightforward application of the Lovász local lemma 
assures that the conjecture is true for sufficiently large lists (for a careful introduction 
to the local lemma and the probabilistic method in general we refer the reader to 
[3]). In fact, the bound 64 comes as a special case of a result on nonrepetitive 
colorings of bounded degree graphs (Alon et al. [2]; see also [13]). Recently Grytczuk, 
Przybyło and Zhu [15] proved that lists of size at least 4 suffice. They achieve this 
almost tight bound by applying an enhanced version of the local lemma due to Pegden 
[21]. In Section 2 we give a simple argument for the same bound. 

This research would not have emerged without the contribution of Moser on his way to an 
algorithmic proof of the Lovász local lemma [20]: his entropy compression argument. 
This was widely discussed in the combinatorics community and we refer the reader 
to the excellent expositions of the topic by Tao [22] and Fortnow [12]. 

In this paper we apply the above-mentioned approach to games involving 
nonrepetitive sequences. 

The nonrepetitive game over a symbol set S is played by two players in the 
following way. The players collectively build a sequence choosing from S, one by 
one, consecutive terms of the sequence. The first player, Ann, is trying to avoid 
repetitions, while the second player, Ben, does not necessarily cooperate. Of course, 
just by mimicking Ann's moves Ben can force a lot of repetitions of size 1. It turns 
out however that for large enough S he cannot force any larger repetition at all! 
Pegden [21], using his extension of the Lovász local lemma, proved that Ann has 
a strategy in the nonrepetitive game to build an arbitrarily long sequence with no 
repetition of size greater than 1 over a symbol set of size at least 37 (no matter how 
perfidiously Ben is playing). In this paper we prove (Theorem 3) that Ann can do 
the same on every set of symbols of size at least 6. On the other hand, Ben can 
easily force nontrivial repetitions in a game on just 3 symbols (see [21]). Thus, the 
minimum size of a set of symbols required to ensure Ann's strategy is 4, 5 or 6. 

The erase-repetition game over a set of symbols S is also a two-player game 
between Ann and Ben. As before they build a sequence picking symbols alternately 
from S and appending them to the end of the sequence built so far. But this 
time whenever a repetition occurs the second instance of the repeated block is 
immediately erased and the next player continues extending the remaining prefix 
of the sequence. We prove (Theorem 2) that there is a strategy for Ann in this 
game to build an arbitrarily long nonrepetitive sequence over at least 8 symbols. 

The paper is organized as follows. Section 2 contains the generic argument 
proving that from any sequence of lists, each of size 4, one can choose a nonrepetitive 
sequence. Section 3 introduces a bit of the theory of generating functions used in counting 
arguments. Sections 4 and 5 are devoted to the erase-repetition game and the nonrepetitive 
game, respectively. 

2. The algorithm 

Consider the following randomized algorithm. The input is a sequence of lists 
$L_1, \ldots, L_n$. Random elements are chosen independently with uniform distribution. 



Algorithm 1: Choosing a nonrepetitive sequence from lists of size 4 
    i <- 1 
    while i <= n do 
        s_i <- random element of L_i 
        if s_1, ..., s_i is nonrepetitive then 
            i <- i + 1 
        else 
            there is exactly one repetition, say s_{i-2h+1}, ..., s_{i-h}, s_{i-h+1}, ..., s_i 
            i <- i - h + 1 
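
A minimal runnable sketch of Algorithm 1 is given below. The names choose_nonrepetitive and suffix_repetition_size, as well as the max_steps safety cap, are our assumptions and do not appear in the paper. Since the sequence is nonrepetitive before each new symbol is appended, only repetitions ending at the last position need to be checked.

```python
import random


def suffix_repetition_size(seq):
    """Return h if seq ends with two identical blocks of length h, otherwise 0."""
    n = len(seq)
    for h in range(1, n // 2 + 1):
        if seq[n - 2 * h:n - h] == seq[n - h:]:
            return h
    return 0


def choose_nonrepetitive(lists, rng=None, max_steps=1_000_000):
    """Randomized erase procedure: pick s_i from lists[i], erase on repetition."""
    if rng is None:
        rng = random.Random()
    seq = []
    for _ in range(max_steps):          # hypothetical cap so the sketch always halts
        if len(seq) == len(lists):
            return seq
        seq.append(rng.choice(lists[len(seq)]))
        h = suffix_repetition_size(seq)
        if h:
            del seq[-h:]                # erase the second copy of the repeated block
    raise RuntimeError("step budget exhausted before reaching length n")


rng = random.Random(2011)
lists = [rng.sample(range(9), 4) for _ in range(40)]
print(choose_nonrepetitive(lists, rng))
```

Deleting the last h entries corresponds to the assignment i <- i - h + 1, because the list length plays the role of i - 1.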



The general idea is that if Algorithm 1 works long enough for all evaluations of 
the random experiments, then a lot of repetitions occur, based on which we can 
compress a random string to a better extent than is actually possible. 

Theorem 1. For every $n \ge 1$ and a sequence of sets $L_1, \ldots, L_n$, each of size 4, 
there is a nonrepetitive sequence chosen from $L_1, \ldots, L_n$. 

Proof. Suppose for a contradiction that it is not possible to choose a nonrepetitive 
sequence from $L_1, \ldots, L_n$. This means that Algorithm 1 never terminates on this 
input, whatever the outcomes of the random choices. In the following, by the $j$-th 
step of the algorithm, we mean the $j$-th iteration of the while loop. Set $M$ to be a 
sufficiently large integer. We are going to record, in two different ways, the possible 
scenarios of what the algorithm does in the first $M$ steps. 

Order arbitrarily the elements of each $L_i$. In each step the algorithm picks a 
random element from a list of size 4. Let $r_j$ ($1 \le j \le M$) be the position of the 
chosen element in the appropriate list. Clearly, $r_1, \ldots, r_M$ is a sequence of random 
variables with $4^M$ possible evaluations. When we fix evaluations of $r_1, \ldots, r_M$ we 
make Algorithm 1 deterministic. 

For fixed evaluations of $r_1, \ldots, r_M$, let $d_1 = 1$ and $d_j$ ($2 \le j \le M$) be the difference 
between the values of variable $i$ after the $j$th and $(j-1)$th steps of the algorithm. The 
important properties are: 

(i) $d_j \le 1$, for all $1 \le j \le M$, 

(ii) $\sum_{j=1}^{k} d_j \ge 1$, for all $1 \le k \le M$. 

A pair $(D, S)$ is a log if there is an evaluation of $(r_1, \ldots, r_M)$ such that $D$ is the 
corresponding sequence of differences and $S$ is the final sequence produced after $M$ 
steps of the algorithm. The key point is that a log encodes all values of $r_1, \ldots, r_M$ 
in a lossless fashion. 

Claim. Every log corresponds to a unique evaluation of $r_1, \ldots, r_M$. 

Proof of Claim. Given a log $((d_1, \ldots, d_M), S_M)$ with $S_M = (s_1, \ldots, s_l)$ we are going 
to decode the evaluation of $r_M$ (the last random choice taken) and $S_{M-1}$, the 
sequence constructed after $M - 1$ steps. Then by simple iteration one can extract 
all the remaining values of $r_{M-1}, \ldots, r_1$. 

If $d_M = 1$ then the element generated in the $M$th step is appended to the end of 
$S_{M-1}$. Thus the value of $r_M$ is simply the position of $s_l$ in $L_l$. Moreover, no repetition 
occurred after the $M$th step and therefore $S_{M-1} = (s_1, \ldots, s_{l-1})$. 

If $d_M \le 0$ then some symbols were erased after the $M$th step. But, since we 
know the size of the repeated sequence, namely $h = |d_M| + 1$, and only one part of 
it was erased, we can read and copy the appropriate block to restore the sequence 
before the erasure $(s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_l)$. Then we read the value of $r_M$ as the 
position of $s_l$ in $L_{l+h}$ and $S_{M-1}$ as $(s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_{l-1})$ (in case of $h = 1$ 
we put $S_{M-1} = (s_1, \ldots, s_l)$). □ 
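
The decoding in the proof of the Claim can be tried out directly. The sketch below records a log while running the erase procedure for a fixed number of steps and then recovers the random choices from the log alone. All names are ours, positions in lists are counted from 0, and we supply at least as many lists as steps so that the run is never cut short.

```python
import random


def suffix_repetition_size(seq):
    """Return h if seq ends with two identical blocks of length h, otherwise 0."""
    n = len(seq)
    for h in range(1, n // 2 + 1):
        if seq[n - 2 * h:n - h] == seq[n - h:]:
            return h
    return 0


def run_with_log(lists, steps, rng):
    """Run `steps` iterations; return the choices r_j, the differences d_j and the final sequence."""
    seq, choices, diffs = [], [], []
    for _ in range(steps):
        current_list = lists[len(seq)]
        r = rng.randrange(len(current_list))
        seq.append(current_list[r])
        h = suffix_repetition_size(seq)
        if h:
            del seq[-h:]
        choices.append(r)
        diffs.append(1 - h)             # d_j = 1 without erasure, d_j = 1 - h otherwise
    return choices, diffs, seq


def decode(lists, diffs, final_seq):
    """Recover r_1, ..., r_M from the log (diffs, final_seq), working backwards."""
    seq = list(final_seq)
    choices = []
    for d in reversed(diffs):
        h = 1 - d
        if h > 0:
            seq.extend(seq[len(seq) - h:])   # copy the surviving block to undo the erasure
        last = seq.pop()                     # symbol drawn in this step
        choices.append(lists[len(seq)].index(last))
    choices.reverse()
    return choices


rng = random.Random(7)
lists = [rng.sample("abcdef", 4) for _ in range(200)]
choices, diffs, final_seq = run_with_log(lists, 200, rng)
print(decode(lists, diffs, final_seq) == choices)
```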

Let $T_M$ be the number of sequences $d_1, \ldots, d_M$ satisfying (i), (ii) and additionally 
$\sum_{j=1}^{M} d_j = 1$. Such sequences are in close relation to plane trees, and are known to 
be enumerated by Catalan numbers, i.e., $T_{M+1} = C_M = \frac{1}{M+1}\binom{2M}{M} = o(4^M)$. Note 
that every feasible sequence of differences in a log has total sum less than $n$ (as 
Algorithm 1 never terminates). The number of sequences satisfying (i), (ii) but 
with total sum equal $k$ (fixed $k \ge 1$) is at most $T_M$. Thus, we conclude that the 
number of all feasible difference sequences of size $M$ is at most $n \cdot T_M$. Clearly, for 
every feasible sequence of differences $D$ the number of sequences which can occur 
in a log with $D$ is at most $4^n$. Since the number of logs is exactly $4^M$ we get 

\[ 4^M \le n \cdot T_M \cdot 4^n = o(4^M) \]

which is a contradiction for large enough $M$. This means that the number of 
realizations which do not generate a nonrepetitive sequence of length $n$ is smaller than 
the number of all realizations. □ 
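
The enumeration by Catalan numbers can be checked numerically for small lengths. The sketch below (function names are ours) counts the sequences satisfying (i), (ii) and having total sum 1 by a simple dynamic program over partial sums, and prints the ratio to the corresponding power of 4:

```python
from math import comb


def count_difference_sequences(M):
    """Number of (d_1, ..., d_M) with d_j <= 1, all partial sums >= 1 and total sum 1."""
    ways = {1: 1}                        # d_1 = 1 is forced by (i) and (ii)
    for _ in range(M - 1):
        nxt = {}
        for a, w in ways.items():
            for b in range(1, a + 2):    # the next partial sum b satisfies 1 <= b <= a + 1
                nxt[b] = nxt.get(b, 0) + w
        ways = nxt
    return ways.get(1, 0)


def catalan(m):
    return comb(2 * m, m) // (m + 1)


for M in range(1, 11):
    T = count_difference_sequences(M)
    print(M, T, catalan(M - 1), T / 4 ** M)
```

The last column decreases towards 0, in line with the estimate o(4^M) used in the proof.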

3. Preliminaries 

We make some use of the theory of generating functions. We consider only algebraic 
functions. A generating function $t(z) = \sum_n T_n z^n$ with positive radius of 
convergence is algebraic if there exists a nonconstant polynomial $P(z, t) \in \mathbb{C}[z, t]$ (the defining 
polynomial) such that $P(z, t(z))$ is constantly zero within the disc of convergence 
of $t(z)$. It is a well known fact that, if the radius of convergence of $\sum_n T_n z^n$ is 
strictly greater than $\alpha$, then $T_n = o(\alpha^{-n})$. The following observation is 
fundamental in the analysis of algebraic generating functions, a thorough study of which can 
be found in [11] (Chapter VII.7). 

Observation. Let $t(z) = \sum_n T_n z^n$ be a nonpolynomial algebraic generating 
function with defining polynomial $P(z, t)$. Then the radius of convergence of $t(z)$ is one 
of the roots of the discriminant of $P(z, t)$ with respect to the variable $t$ (i.e. the 
resultant of $P(z, t)$ and $\partial_t P(z, t)$ with respect to $t$). 

The coefficients of the functions we use are nonnegative integers. In such cases 
it is easy to see that the radius of convergence is not greater than 1, whenever a 
function has infinitely many nonzero coefficients. In order to bound the growth 
of the sequence of coefficients of such a function, we calculate the discriminant of its 
defining polynomial $P(z, t)$ with respect to the variable $t$, and look for its positive 
real roots in the interval $(0, 1]$. If there is only one such root, it must be the radius 
of convergence of the function. 

4. The erase-repetition game 

Theorem 2. In the erase-repetition game over a symbol set of size 8, there exists 
a strategy for Ann to build an arbitrarily long nonrepetitive sequence. 

Proof. We fix n and prove that Ann has a strategy to build a nonrepetitive sequence 
of size n. In fact, the strategy for Ann will be randomized and we will show that 
for every strategy of Ben there is an evaluation of random experiments leading to 
the sequence of size n against that strategy. The fact that for every strategy of Ben 
there is a strategy for Ann to build a sequence of size n implies that Ann simply 
has a strategy to build such a sequence. 

Let $C$ be the size of the symbol set. The argument to be presented turns out to 
work for $C \ge 8$. The strategy for Ann is the following: choose a random element 
distinct from the last three symbols in the sequence constructed so far. In this 
setting, Ann does not generate repetitions of size 1, 2 and 3. Obviously, Ben can 
cause many repetitions of size 1 but repetitions of size 2 and 3 are not possible. 
Indeed, in order to get a repetition of the form 'abcabc' the last three symbols 
must be generated by Ben. Consider Ann's move just before Ben puts 'b' in the 
repeated block. As she could not play the preceding symbol 'a', she must have invoked 
a repetition. But all her repetitions are of size at least 4 and therefore the repeated 
block must have ended with 'abca'. This would mean that she played 'a', which 
is not possible as this symbol is not distinct from the last three in the current 
sequence at that step. An analogous argument proves that repetitions of size 2 are 
also impossible. 

Fix $n$ and a strategy of Ben. Take $M$ sufficiently large and consider possible 
scenarios of the first $2M$ moves of the game against that fixed strategy of Ben. 
Suppose, for a contradiction, that the size of a sequence after $2M$ moves is always 
(for any evaluation of Ann's choices) less than $n$. Ann generates exactly $M$ elements. 
Let $r_j$ ($1 \le j \le M$) be the $j$th symbol generated by Ann. Clearly, $r_1, \ldots, r_M$ is a 
sequence of random variables with at least $(C - 3)^M$ possible evaluations. When 
we fix an evaluation of $r_1, \ldots, r_M$ the course of the whole game is determined. 

Let $h_j$ ($1 \le j \le 2M$) be the length of the sequence generated after $j$ moves 
(including possible erasure invoked by the $j$th move) and let $d_1, \ldots, d_{2M}$ be the 
sequence of differences: $d_1 = 1$, $d_j = h_j - h_{j-1}$ for $2 \le j \le 2M$. Note that $d_j = 1$ 
means that there is no erasure after the $j$th move and $d_j < 1$ indicates that a repeated 
block of size $|d_j| + 1$ was removed. A pair $(D, S)$ is a game log and $D$ is feasible if 
there is an evaluation of $r_1, \ldots, r_M$ such that $D$ is the sequence of differences and $S$ 
is the final sequence produced after $2M$ moves. A pair $(D, S)$ is a reduced game log 
if it is a log but with all zeros in $D$ erased. Note that any sequence of differences 
$D = (d_1, \ldots, d_m)$ in a reduced log satisfies: 

(i) $m \le 2M$, 

(ii) $d_j \in \{1, -3, -4, -5, \ldots\}$, for all $1 \le j \le m$, 

(iii) $\sum_{j=1}^{k} d_j \ge 1$, for all $1 \le k \le m$. 

Claim. Every reduced log corresponds to a unique evaluation of $r_1, \ldots, r_M$. 

Proof. Given a reduced log $((d_1, \ldots, d_m), S_m)$ with $S_m = (s_1, \ldots, s_l)$, we decode 
all random choices taken by Ann in two steps. First we reconstruct the sequence 
$x_1, \ldots, x_m$ of all symbols introduced in the game except those (of Ben) generating 
repetitions of size 1. The introduced symbols generating repetitions of size 1 are 
called bad, other symbols are good. The move is good (bad) if a good (bad) symbol 
is introduced. The number of good moves played is exactly the size of the difference 
sequence in the reduced log, namely $m$. Note that $S_m$ is the sequence formed after 
the $m$th good move (even if a bad move is played afterwards, it does not change 
the sequence). 

We reconstruct the sequence of good symbols backwards, i.e., we first decode 
$x_m$, which is the last good symbol introduced, and the sequence $S_{m-1}$ constructed 
after $m - 1$ good moves. Then, by simple iteration, we extract all the remaining 
good symbols $x_{m-1}, \ldots, x_1$. 

If $d_m = 1$, then the $m$th good symbol introduced did not invoke a repetition. 
Thus, the last good symbol introduced is the last symbol of the final sequence, i.e. 
$x_m = s_l$ and $S_{m-1} = (s_1, \ldots, s_{l-1})$. 

If $d_m < 0$, then some symbols were erased after the $m$th good move. But since 
we know the size of the repetition, namely $h = |d_m| + 1$, and only one half of it 
was erased, we can read and copy the first part of the repeated block to restore 
$S_{m-1} = (s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_{l-1})$ and $x_m = s_l$. 

Once we get all $x_1, \ldots, x_m$, we read the sequence from the beginning and check 
whether the symbols agree with the strategy of Ben we fixed. The difference appears 
only where Ben introduces a bad symbol. There we extend the sequence with 
this symbol and continue. This way we reconstruct the sequence of all symbols 
introduced in the game and clearly every second symbol is chosen by Ann. □ 

By a game walk we mean a sequence $d_1, \ldots, d_m$ satisfying (ii), (iii) and 
additionally $\sum_{j=1}^{m} d_j = 1$. Let $T_m$ be the number of game walks of length $m$. By our 
assumption that Ann never wins, every feasible sequence of differences in a reduced 
log sums up to a number smaller than $n$. The number of sequences of size $m$ 
satisfying (ii), (iii) but with a total sum $k$ (for fixed $k \ge 1$) is bounded by $T_{m+3}$ 
(just append two '1's and '$-(k + 1)$' to the end). Note also that $T_m \le T_{m+1}$ for 
$m > 1$. Indeed, for a given sequence $d_1, \ldots, d_m$ let $i$ be the least index with $d_i < 0$ 
(there must be such provided $m > 1$). Then $d_1, \ldots, d_{i-1}, 1, d_i - 1, d_{i+1}, \ldots, d_m$ is 
a sequence counted by $T_{m+1}$ (the inserted step up is compensated by lowering $d_i$ by one, 
so the total sum is still 1) and this extension is injective. Finally, all feasible 
sequences of differences are of size at most $2M$. All this yields that the number of 
feasible difference sequences in a reduced log is at most $2M \cdot n \cdot T_{2M+3}$. For a given 
feasible sequence of differences $D$, the number of final sequences which can occur 
with $D$ in a reduced log is bounded by $C^n$. Thus, the number of reduced logs is 
bounded by 

\[ 2M \cdot n \cdot T_{2M+3} \cdot C^n. \]

We turn to the approximation of $T_{2M}$. Every game walk $d_1, \ldots, d_m$ is either a 
single step up (i.e., $m = 1$, $d_1 = 1$), or it can be uniquely decomposed into $|d_m| + 1$ 
subsequent game walks of total length $m - 1$. The $j$th component of the 
decomposition is the substring between the last visit of height $j - 1$ and the last visit of height 
$j$ (i.e. between the last $k$ such that $\sum_{i=1}^{k} d_i = j - 1$ and the last $l$ such that $\sum_{i=1}^{l} d_i = j$). 
This description together with the fact that if $m > 1$, then $|d_m| + 1 \ge 4$, certify 
that the generating function $t(z) = \sum_{n \in \mathbb{N}} T_n z^n$ satisfies the following functional 
equation: 

\[ t(z) = z + z\,(t(z)^4 + t(z)^5 + \ldots), \]

where the right hand side is $z + z \frac{t(z)^4}{1 - t(z)}$. From this equation we extract a polynomial 

\[ P(z, t) = z t^4 + t^2 - (1 + z) t + z \]

that defines $t(z)$. In the standard way we calculate the discriminant polynomial 
obtaining: 

\[ -4 - 19z + 32z^2 - 2z^3 + 36z^4 + 229z^5. \]

This polynomial has only one positive real root equal to $\rho = 0.457\ldots > 5^{-1/2}$. Pick 
any $\alpha$ with $\rho^{-2} < \alpha < 5$. Then $T_{2M} = o(\alpha^M)$. 
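
The numbers in this step can be reproduced with a short experiment. The sketch below counts game walks directly from conditions (ii), (iii) and total sum 1, and locates the positive root of the discriminant polynomial by bisection on [0, 1], relying on the statement above that there is only one positive real root. Function names and the cut-off length 150 are our choices.

```python
def count_game_walks(max_len):
    """T[m] = number of walks with steps in {1, -3, -4, ...}, partial sums >= 1, total sum 1."""
    T = [0] * (max_len + 1)
    ways = {1: 1}                                   # heights after the forced first step d_1 = 1
    T[1] = 1
    for m in range(2, max_len + 1):
        nxt = {}
        for a, w in ways.items():
            nxt[a + 1] = nxt.get(a + 1, 0) + w      # step +1
            for b in range(1, a - 2):               # step of at most -3 landing on height b >= 1
                nxt[b] = nxt.get(b, 0) + w
        ways = nxt
        T[m] = ways.get(1, 0)
    return T


def discriminant(z):
    return -4 - 19 * z + 32 * z**2 - 2 * z**3 + 36 * z**4 + 229 * z**5


def bisect_root(f, lo, hi, iterations=60):
    """Bisection for a sign change of f on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


rho = bisect_root(discriminant, 0.0, 1.0)
T = count_game_walks(150)
print(T[1:10])
print(rho, rho ** -2)            # rho ** -2 should lie below 5
print(T[150] / T[149], 1 / rho)  # consecutive ratio compared with 1 / rho
```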

By the claim the number of realizations is exactly the number of reduced logs. 
That gives 

\[ (C - 3)^M \le 2M \cdot n \cdot T_{2M+3} \cdot C^n = 2M \cdot n \cdot o(\alpha^M) \cdot C^n = o(5^M). \]

Thus for $C \ge 8$ and sufficiently large $M$ we obtain a contradiction. □ 

5. The nonrepetitive game 

Theorem 3. In the nonrepetitive game over a symbol set of size 6, there is a 
strategy for Ann to build an arbitrarily long sequence with no repetitions of size 
greater than 1. 

Proof. We fix $n$ and prove that Ann has a strategy to build a sequence of size 
$n$ without repetitions of size greater than 1. As before we consider a randomized 
strategy for Ann and we show that for every strategy of Ben there is an evaluation of 
random experiments leading to the generation of a nonrepetitive sequence of size 
$n$. This means that Ben cannot have a winning strategy. Therefore, there exists a 
winning strategy for Ann. 

In this proof, by a repetition we mean a repetition of size greater than 1. 

Let $s_1, \ldots, s_{m-1}$ be the sequence already generated in the game and suppose 
that it is Ann's turn ($m$ is odd). The strategy for Ann goes as follows: choose any 
symbol at random, but 

(i) exclude $s_{m-2}$, 

(ii) if $s_{m-1} = s_{m-4}$, then exclude $s_{m-3}$, 

(iii) if only one symbol has been excluded in (i) and (ii), then exclude $s_{m-4}$. 

This strategy explicitly ensures that no repetitions of size 2 and 3 occur in the 
game. It turns out that repetitions of size 4 are avoided as well. Suppose for a 
contradiction that at some point in the game a sequence with a suffix of the form 
$x_1 x_2 x_3 x_4 x_1 x_2 x_3 x_4$ is produced. Suppose also that Ann introduces the last symbol, 
namely $x_4$. As she did not prevent a repetition of size 4, the rule (iii) of the 
strategy did not exclude a symbol and therefore rule (ii) must have been invoked. In 
particular, $x_3 = x_4$. But this means that in the previous move of Ann (when she 
introduced $x_2$ in the repeated block) the symbols excluded by (i) and (ii) were the 
same, so rule (iii) must have been applied. But that rule excludes $x_2$, a 
contradiction. Analogous reasoning works for the case when Ben finishes a repetition of size 
4. 
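
Rules (i)-(iii) are easy to try out in a plain game without backtracking. In the sketch below the two opponents (mimic and random_ben) are hypothetical strategies chosen for illustration, and rules referring to symbols that do not exist yet at the start of the game are simply skipped, which is our assumption. Repetitions of size 5 or more can still appear in such a play; they are handled by the simulation described next.

```python
import random


def ann_move(seq, symbols, rng):
    """Pick Ann's next symbol at random, subject to exclusion rules (i)-(iii)."""
    excluded = set()
    if len(seq) >= 2:
        excluded.add(seq[-2])                        # (i) exclude s_{m-2}
    if len(seq) >= 4 and seq[-1] == seq[-4]:
        excluded.add(seq[-3])                        # (ii) exclude s_{m-3}
    if len(seq) >= 4 and len(excluded) == 1:
        excluded.add(seq[-4])                        # (iii) exclude s_{m-4}
    return rng.choice([a for a in symbols if a not in excluded])


def mimic(seq, symbols, rng):
    """Hypothetical Ben: repeat the symbol Ann has just played."""
    return seq[-1]


def random_ben(seq, symbols, rng):
    """Hypothetical Ben: play uniformly at random."""
    return rng.choice(symbols)


def play(ben, symbols, moves, rng):
    """Ann moves first; the players alternate until `moves` symbols are placed."""
    seq = []
    while len(seq) < moves:
        player = ann_move if len(seq) % 2 == 0 else ben
        seq.append(player(seq, symbols, rng))
    return seq


def has_repetition_of_size(seq, h):
    return any(seq[i:i + h] == seq[i + h:i + 2 * h] for i in range(len(seq) - 2 * h + 1))


game = play(mimic, list("abcdef"), 40, random.Random(5))
print("".join(game))
print([has_repetition_of_size(game, h) for h in (2, 3, 4)])
```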

Fix a strategy for Ben. We simulate the play between randomized Ann and this 
fixed strategy, and whenever a repetition of size h occurs in the mth move (of the 
real game), we backtrack to the move m — h + 1. This means that we remove 
the whole repeated segment and continue the simulation starting from the move 
m — h + 1 again (with independent random experiments). 

A search sequence is the sequence of consecutive symbols chosen by players in 
the simulation. Note that it is not possible for Ben to introduce three symbols in 
a row in the search sequence. Indeed, if he introduces two symbols in a row, then 
there must have been a repetition (of odd size) after the first symbol. Thus, the 
second one is the same as the symbol just erased at this position (as Ben's strategy 
is fixed in the simulation). This means that the second symbol could not generate 
a repetition and therefore Ann is next to play in the simulation. 

The weight of a search sequence is the number of symbols chosen by Ann in the 
sequence. Fix M large enough. We are going to show that there is a scenario of 
the first M random experiments (first M moves of Ann) leading the simulation to 
an outcome sequence of size n. This will prove that Ann has a strategy to build a 
sequence of size n against the fixed strategy of Ben. For a contradiction we suppose 
that all outcome sequences generated after M moves of Ann in the simulation are 
of length less than n for all possible evaluations of random experiments. 

Clearly, a search sequence of weight $M$ is uniquely determined by the sequence 
of Ann's $M$ choices. Let $r_1, \ldots, r_M$ be the symbols chosen by Ann. As she always 
chooses one symbol out of at least $C - 2$ symbols, the sequence $r_1, \ldots, r_M$ has at 
least $(C - 2)^M$ possible evaluations. A search sequence induced by an evaluation 
of $r_1, \ldots, r_M$ is called a realization of this evaluation. 

Let $h_j$ be the length of the current sequence just before the $j$th step (move) of 
the simulation. The sequence $(h_j)$ is called a height sequence. If Ann introduces a 
symbol in the $j$th step, then her next move is in step $k \in \{j + 1, j + 2, j + 3\}$ (as 
Ben never plays three times in a row). There are only a few possible extensions of 
the height sequence from $h_j$ up to $h_k$: 

(0) Ann makes no repetition in the $j$th step and Ben makes no repetition in the 
$(j + 1)$th step. In this case $k = j + 2$ and $h_{j+1} = h_j + 1$, $h_{j+2} = h_j + 2$. 

(1) Ann makes a repetition of odd size, at least 5, in the $j$th step and therefore 
she plays again in the $(j + 1)$th step. Here $k = j + 1$ and $h_k \le h_j - 4$. 

(2) Ann makes a repetition of even size, at least 6, in the $j$th step and Ben plays 
no repetition in the $(j + 1)$th step. Here $k = j + 2$ and $h_k \le h_j - 4$. 

(3) Ann makes no repetition in the $j$th step and Ben produces a repetition of even 
size, at least 6, in the $(j + 1)$th step. Here again $k = j + 2$ and $h_k \le h_j - 4$. 

(4) Ann makes no repetition in the $j$th step. Ben makes a repetition of odd size, 
at least 5, in the $(j + 1)$th step. Then he plays no repetition in the $(j + 2)$th 
step. Here $k = j + 3$ and $h_k \le h_j - 2$. 

We want to get rid of some redundancy in the height sequence. More precisely, we 
encode the sequence of heights into its subsequence consisting of the $h_j$'s corresponding 
to Ann's moves with a little extra information. Let $h_j$, $h_k$ be again the heights of 
the current sequence right before any two consecutive moves of Ann. Note that 

* If $h_k > h_j$, then the sequence of heights between $h_j$ and $h_k$ is of type (0). 

* If $h_k = h_j - 2$, then the sequence of heights between $h_j$ and $h_k$ is of type (4). 

* If $h_k \le h_j - 4$, then the sequence of heights between $h_j$ and $h_k$ is of type (1), 
(2), (3) or (4). 

Therefore, in order to record the whole height sequence it is enough to remember the 
subsequence $h'_1, \ldots, h'_M$ of heights corresponding to Ann's moves and additionally, 
if $h'_{j+1} \le h'_j - 4$, to record $\mathrm{type}(h'_j, h'_{j+1}) \in \{1, 2, 3, 4\}$, which is the type of the 
original height sequence between symbols corresponding to $h'_j$ and $h'_{j+1}$. 

Finally, note that all the $h'_j$'s are even (as the current sequence before Ann's 
move contains an even number of symbols). The reduced sequence of differences is: 
$d_1 = 1$, $d_{j+1} = (h'_{j+1} - h'_j)/2$ for $1 \le j < M$, and the type function $\mathrm{type}(d_{j+1}) = 
\mathrm{type}(h'_j, h'_{j+1})$, provided the latter is defined. Note that 

(i) $d_j \le 1$, 

(ii) $\sum_{j=1}^{k} d_j \ge 1$, for all $1 \le k \le M$, 

(iii) $\mathrm{type}(d_j)$ is defined if and only if $d_j \le -2$. 

A pair $((D, \mathrm{type}), S)$ is a search log if there is an evaluation of $r_1, \ldots, r_M$ such 
that $D$ is the reduced sequence of differences in the realization of $r_1, \ldots, r_M$, type 
is a type function of $D$, and $S$ is the final sequence produced after $M$ steps of Ann 
in this realization of the search procedure. 

Claim. Every search log corresponds to a unique evaluation of $r_1, \ldots, r_M$. 

Proof. Given a search log $((D, \mathrm{type}), S)$ where $S = (s_1, \ldots, s_l)$ we decode the 
evaluation of $r_1, \ldots, r_M$ in a few steps. First we extract the height sequence $h_1, \ldots, h_m$ 
from $(D, \mathrm{type})$ and put additionally $h_{m+1} = |S|$. Now, we are going to describe 
how to reconstruct the sequence $x_1, \ldots, x_m$ of all symbols introduced in the 
simulation. This is done in backward direction, i.e., we decode first $x_m$ and the sequence 
$S_{m-1}$ constructed after $m - 1$ steps of the simulation. Then by simple iteration we 
extract all the remaining symbols $x_{m-1}, \ldots, x_1$. 

If $h_{m+1} - h_m = 1$, then the introduction of $x_m$ did not invoke a repetition. Thus, 
$x_m$ is the last symbol in the final sequence $S$, i.e., $x_m = s_l$ and $S_{m-1} = (s_1, \ldots, s_{l-1})$. 

If $h_{m+1} - h_m \le 0$, then some symbols were erased after the introduction of 
$x_m$. But we know the size of the repetition, namely $h = |h_{m+1} - h_m| + 1$, and 
since only one half of it was erased, we can copy the appropriate block to restore 
$S_{m-1} = (s_1, \ldots, s_l, s_{l-h+1}, \ldots, s_{l-1})$ and $x_m = s_l$. 

Once we get all $x_1, \ldots, x_m$ we read the sequence from the beginning and track 
the current sequence in the simulation. Every time the current sequence is of even 
length the next symbol is introduced by Ann. □ 

By a typed search walk we mean a pair $((d_1, \ldots, d_M), \mathrm{type})$ satisfying (i), (ii), 
(iii) and additionally $\sum_{j=1}^{M} d_j = 1$. Let $T_M$ be the number of typed search walks of 
length $M$ (i.e., $D$ is of length $M$). By our assumption that Ann never wins, every 
feasible sequence of differences in a typed search walk sums up to less than $n$. The 
number of typed search walks of length $M$ satisfying (i), (ii), (iii) with total sum $k$ 
(fixed $k \ge 1$) is at most $T_{M+1}$ (just append $-(k - 1)$ to the end and pick an arbitrary 
type, if necessary). Furthermore, $T_m \le T_{m+1}$ for $m > 1$. All this implies that the 
number of feasible typed search walks is at most $n \cdot T_{M+1}$. For a given feasible typed search 
walk $(D, \mathrm{type})$ the number of final sequences which can occur with $(D, \mathrm{type})$ in a 
search log is bounded by $C^n$. Thus, the number of search logs is bounded by 

\[ n \cdot T_{M+1} \cdot C^n. \]

We turn to the approximation of $T_m$. Every typed search walk $((d_1, \ldots, d_m), \mathrm{type})$ 
is either a single step up (i.e., $m = 1$, $d_1 = 1$), or it can be uniquely decomposed 
into $|d_m| + 1$ subsequent search walks of total length $m - 1$ and additionally into the 
type of $d_m$ if it is defined (i.e., if $d_m \le -2$). This decomposition (analogous to the one in 
the proof of Theorem 2) gives the following functional equation for the generating 
function $t(z)$: 

\[ t(z) = z + z t^2(z) + 4z\,(t^3(z) + t^4(z) + t^5(z) + \ldots), \]

where $z$ stands for a trivial one-step-up walk, $z t^2(z)$ stands for the case $d_m = -1$ 
in which $d_m$ has no type, and the last term stands for the case $d_m \le -2$. The right 
hand side of the equation is in fact equal to $z + z t^2(z) + 4z \frac{t^3(z)}{1 - t(z)}$. From that form 
we derive the defining polynomial for $t(z)$: 

\[ P(z, t) = -t + t^2 + z - tz + t^2 z + 3 t^3 z. \]

In the standard way we calculate the discriminant polynomial obtaining: 

\[ 1 + 12z - 24z^2 - 80z^3 - 288z^4. \]

The radius of convergence of $t(z)$ is one of the roots of the above polynomial. 
This polynomial has only one positive real root $\rho = 0.2537\ldots > 4^{-1}$. Therefore 
$T_M = o(4^M)$. 
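
The functional equation itself can be used to compute the first coefficients $T_m$. The sketch below iterates the equation on truncated power series (the names mul and typed_walk_counts and the truncation degree 30 are ours). Every pass through the equation fixes at least one more coefficient, so n passes suffice for degree n.

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated to degree n."""
    r = [0] * (n + 1)
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                if i + j > n:
                    break
                r[i + j] += a * b
    return r


def typed_walk_counts(n):
    """Coefficients T_0, ..., T_n of t(z) = z + z t^2 + 4z (t^3 + t^4 + ...)."""
    t = [0] * (n + 1)
    for _ in range(n):
        rhs = [0] * (n + 1)
        rhs[1] = 1                               # the term z
        power = mul(t, t, n)                     # t^2
        for k in range(2, n + 1):
            weight = 1 if k == 2 else 4          # four possible types once d_m <= -2
            for d in range(n):
                rhs[d + 1] += weight * power[d]  # multiplying by z shifts coefficients by one
            power = mul(power, t, n)             # next power of t
        t = rhs
    return t


T = typed_walk_counts(30)
print(T[1:9])
for M in (10, 20, 30):
    print(M, T[M] / 4 ** M)                      # the ratio shrinks as M grows
```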

By the claim, the number of realizations is exactly the number of search logs. 
That gives 

\[ (C - 2)^M \le n \cdot T_{M+1} \cdot C^n = o(4^M). \]

Therefore for $C \ge 6$ and sufficiently large $M$ we obtain a contradiction. □ 

6. Final remarks 

The expected running time of the algorithm is linear in n for lists of size at 
least 4. It is immediate for lists of size 5, and needs a little effort for size 4. 
Computational experiments suggest different behaviour for size 3. This somehow 
explains the difficulty of Conjecture 1. It might also be the case that the list version 
of Thue's theorem does not hold, as is the case with the list version of the Four Color 
Theorem, although every planar graph is colorable from lists of size 5 [23]. 

It is natural to try a similar approach for other Thue-type problems, especially 
for those in which the Lovász local lemma has been previously successfully applied. 
One such topic concerns graph-theoretic analogues of nonrepetitive sequences. A 
coloring of the vertices of a graph G is nonrepetitive if sequences of colors on all 
simple paths of G are nonrepetitive. The minimum number of colors needed is 
denoted by $\pi(G)$. This parameter is bounded for graphs with bounded degree [2], 
as well as for graphs with bounded treewidth [4], [18]. A major challenge of this 
area is to settle whether $\pi(G)$ is bounded by a constant for all planar G. 

The ideas behind the erase-repetition algorithm already led to the proof [17] that 
for every tree and lists of size 4 one can choose a coloring with no three consecutive 
identical blocks on any simple path. This complements the recent construction from [10] 
proving that no constant size of lists guarantees a nonrepetitive coloring of a tree 
chosen from these lists. 

Another direction is to look for stronger versions of nonrepetitive sequences. Here 
is an interesting variation due to Erdős [8]. A sequence S is strongly nonrepetitive if 
no two adjacent blocks of S are permutations of one another. It is known that there 
are arbitrarily long strongly nonrepetitive sequences over four symbols [16]. But is 
it true that one can choose strongly nonrepetitive sequences from any collection of 
lists of sufficiently large size? 